We consider the problem of information fusion from multiple sensors of different
types with the objective of improving the confidence of inference tasks, such
as object classification, performed from the data collected by the sensors. We
propose a novel technique based on distributed belief aggregation using a
multi-agent prediction market to solve this information fusion problem. To monitor
the improvement in the confidence of the object classification as well as to
disincentivize agents from misreporting information, we have introduced a market
maker that rewards the agents instantaneously as well as at the end of the
inference task, based on the quality of the submitted reports. We have implemented
the market maker's reward calculation in the form of a scoring rule and have
shown analytically that it incentivizes truthful revelation or accurate reporting
by each agent. We have experimentally verified our technique for multi-sensor
information fusion for an automated landmine detection scenario. Our experimental
results show that, for identical data distributions and settings, using our
information aggregation technique increases the accuracy of object classification
favorably as compared to two other commonly used techniques for information
fusion for landmine detection.




1 Introduction 



Information fusion from multiple sensors has been a central research topic in 
sensor-based systems [T7] and recently several multi-agent techniques [IB] have 
been proposed to address this problem. Most of the solutions for multi-sensor 
information fusion and processing are based on Bayesian inference techniques 
[5J[T21[IS]. While such techniques have been shown to be very effective, we
investigate a complementary problem where sensors can behave in a self-interested
manner. Such self-interested behavior can be motivated by malicious nodes that 
might have been planted into the system to subvert its operation, or, by nor- 
mal sensor nodes attempting to give an illusion of efficient performance when 
they do not have enough resources (e.g., battery power) to perform accurate 
measurements. To address this problem, we describe a market-based aggrega- 
tion technique called a prediction market for multi-sensor information fusion 
that includes a utility driven mechanism to motivate each sensor, through its 
associated agent, to reveal accurate reports. 

To motivate our problem we describe a distributed automated landmine
detection scenario used for humanitarian demining. An environment contains
different buried objects, some of which could potentially be landmines. A set of 
robots, each equipped with one of three types of landmine detection sensor such 
as a metal detector (MD), or a ground penetrating radar (GPR) or an infra-red 
(IR) heat sensor, are deployed into this environment. Each robot is capable 
of perceiving certain features of a buried object through its sensor such as the 
object's metal content, area, burial depth, etc. However, the sensors give noisy 
readings for each perceived feature depending on the characteristics of the object 
as well as on the characteristics of the environment (e.g., moisture content, 
ambient temperature, sunlight, etc.). Consequently, a sensor that works well
in one scenario can fail to detect landmines in a different scenario, and
multiple sensors of different types, possibly with different detection
accuracies, can detect landmines with higher certainty than a single sensor [5]. Within this
scenario, the central question that we intend to answer is: given an initial set
of reports about the features of a buried object, what is a suitable set (number
and type) of sensors to deploy over a certain time window to the object, so
that, over this time window, the fused information from the different sensors
successively reduces the uncertainty in determining the object's type?

Our work in this paper is based on the insight that the scenario illustrated 
above, of fusing information from multiple sources to predict the outcome of an 
initially unknown object, is analogous to the problem of aggregating the beliefs 
of different humans to forecast the outcome of an initially unknown event. Such 
forecasting is frequently encountered in many problems such as predicting the 
outcome of geo-political events, predicting the outcome of financial instruments 
like stocks, etc. Recently, a market-based model called a prediction market has
been shown to be very successful in aiding humans with such predictions and 
with decision-making [TJ [2j [TH [18] . Building on these models, in this paper, we 
describe a multi-agent prediction market for multi-sensor information fusion. 
Besides being an efficient aggregation mechanism, using prediction markets gives 
us several useful features - a mathematical formulation called a scoring rule 
that deters malicious sensors from misreporting information, a regression-based 
belief update mechanism for the sensor agents for incorporating the aggregated 
beliefs (or information estimates) of other sensors into their own calculation, and 
the ability to incorporate an autonomous decision maker that uses expert-level 
domain knowledge to make utility maximizing decisions to deploy additional 
sensors appropriately to improve the detection of an object. Our experimental
results, illustrated with a landmine detection scenario using identical data
distributions and settings, show that the information fusion performed using
our technique reduces the root mean squared error by 5-13% compared to
a previously studied technique for landmine data fusion using the
Dempster-Shafer theory [10] and by 3-8% compared to a distributed data fusion technique [5].

2 Related Work 

Multi-agent Information Fusion. Multi-agent systems have been used to
solve various sensor network related problems and an excellent overview is given
in [16]. In the direction of multi-sensor information processing, significant works
include the use of particle filters [15], distributed data fusion (DDF) architecture
along with its extension, the Bayesian DDF [5] [5], Gaussian processes [T3] and
mobile agent-based information fusion [T5]. For most of the application
domains described in these works, such as water-tide height measurement, wind
speed measurement, robot tracking and localization, etc., self-interested
behavior by the sensors is not considered a crucial problem. For our illustrated
application domain of landmine detection, decision-level fusion techniques have
been reported to be amenable to scenarios where the sensor types are different
from each other, and non-statistical decision-level fusion techniques, such
as Dempster-Shafer theory [10], fuzzy logic [3], and rule-based fusion techniques
[5], have been reported to generalize well. However, in contrast to our work,
these techniques assume that sensors are fully cooperative and never behave
self-interestedly by misreporting information. In [12], the authors have
observed that most sensor-based information aggregation techniques either do not
consider malicious behavior or use high-overhead, cryptographic techniques to
combat it. To deter false reports by sensor nodes in a data aggregation setting,
they propose various lower overhead reputation-based schemes. Our prediction
market-based information aggregation technique is complementary to such
reputation-based aggregation techniques.

Decision-Making using Prediction Markets. A prediction market is 
a market-based aggregation mechanism that is used to combine the opinions 
on the outcome of a future, real-world event from different people, called the
market's traders and forecast the event's possible outcome based on their ag- 
gregated opinion. Recently, multi-agent systems have been used [TJ [7J [Tl] to 
analyze the operation of prediction markets, where the behaviors of the mar- 
ket's participants are implemented as automated software agents. The seminal 
work on prediction market analysis [18] has shown that the mean belief values
of individual traders about the outcome of a future event correspond to the
event's market price. The basic operation rules of a prediction market are simi- 
lar to those of a continuous double auction, with the role of the auctioneer being 
taken up by an entity called the market maker that runs the prediction market. 
Hanson [5] developed a mechanism, called a scoring rule, that can be used by 
market makers to reward traders for making and improving a prediction about 
the outcome of an event, and, showed that if a scoring rule is proper or incentive 
compatible, then it can serve as an automated market maker. Recently, authors 
in [21 [T3] have theoretically analyzed the properties of prediction markets used 
for decision making. In [2], the authors analyzed the problem of a decision 
maker manipulating a prediction market and proposed a family of scoring rules 
to address the problem. In [2] , the authors extended this work by allowing ran- 
domized decision rules, considering multiple possible outcomes and providing 
a simple test to determine whether the scoring rule is proper for an arbitrary 
decision rule-scoring rule pair. In this paper, we use a prediction market for de- 
cision making, but in contrast to previous works we consider that the decision 
maker can make multiple, possibly improved decisions over an event's duration, 
and, the outcome of an event is decided independently, outside the market, 
and not influenced by the decision maker's decisions. Another contribution our 
paper makes is a new, proper scoring rule, called the payment function, that 
incentivizes agents to submit truthful reports. 






3 Problem Formulation 



Let $L$ be a set of objects. Each object has certain features that determine its
type. We assume that there are $f$ different features and $m$ different object types.
Let $\Phi = \{\phi_1, \phi_2, \ldots, \phi_f\}$ denote the set of object features and
$\Theta = \{\theta_1, \theta_2, \ldots, \theta_m\}$ denote the set of object types.
The features of an object $l \in L$ are denoted by $l_\Phi \subseteq \Phi$ and its
type is denoted by $l_\Theta \in \Theta$. As illustrated in the example given
in Section 1, $l_\Phi$ can be perceived, albeit with measurement errors, through
sensors, and our objective is to determine $l_\Theta$ as accurately as possible from
the perceived but noisy values of $l_\Phi$. Let
$\Delta(\Theta) = \{(\delta(\theta_1), \delta(\theta_2), \ldots, \delta(\theta_m)) : \delta(\theta_i) \in [0, 1], \sum_{i=1}^{m} \delta(\theta_i) = 1\}$
denote the set of probability distributions over
the different object types. For convenience of analysis, we assume that when
the actual type of object $l$ is $l_\Theta = \theta_j$, its (scalar) type is expanded into an
$m$-dimensional probability vector using the function $vec : \Theta \to [0, 1]^m$ with
$vec_j = 1$, $vec_{i \neq j} = 0$, which has 1 as its $j$-th component corresponding to $l$'s type $\theta_j$
and 0 for all other components.

Let $A$ denote a set of agents (sensors) and $A^{t,l}_{rep} \subseteq A$ denote the subset of
agents that are able to perceive the object $l$'s features on their sensors at time $t$.
Based on the perceived object features, agent $a \in A^{t,l}_{rep}$ at time $t$ reports a belief
as a probability distribution over the set of object types, which is denoted as
$\mathbf{b}^{a,t,l} \in \Delta(\Theta)$. The beliefs of all the agents are combined into a composite belief,
$\mathbf{B}^{t,l} = Agg_{a \in A^{t,l}_{rep}}(\mathbf{b}^{a,t,l})$, and let $\Theta^{t,l} : \mathbf{B}^{t,l} \to \Delta(\Theta)$ denote a function that
computes a probability distribution over object types based on the aggregated
agent beliefs. Within this setting we formulate the object classification problem
as a decision making problem in the following manner: given an object $l$ and an
initial aggregated belief $\mathbf{B}^{t,l}$ calculated from one or more agent reports for that
object, determine a set of additional agents (sensors) that need to be deployed
at object $l$ such that the following constraint is satisfied:

$\min RMSE(\Theta^{t,l}, vec(l_\Theta))$, for $t = 1, 2, \ldots, T$ (1)

where $T$ is the time window for classifying an object $l$ and RMSE is the root
mean square error given by $RMSE(\mathbf{x}, \mathbf{y}) = \frac{\|\mathbf{x} - \mathbf{y}\|}{\sqrt{m}}$. In other words, at every
time step $t$, the decision maker tries to select a subset of agents such that the
root mean square error (RMSE) between the estimated type of object $l$ and its
actual type is successively minimized.
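The objective above can be illustrated with a minimal Python sketch. It assumes that RMSE is the usual Euclidean distance divided by the square root of the vector length; the helper names `vec` and `rmse` and the example numbers are hypothetical and chosen only for illustration.

```python
import math


def vec(type_index, m):
    """Expand a scalar type (0-based index) into an m-dimensional one-hot vector."""
    return [1.0 if i == type_index else 0.0 for i in range(m)]


def rmse(x, y):
    """Root mean square error between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x))


# Hypothetical example: three object types, the true type is index 0.
estimate = [0.7, 0.2, 0.1]   # estimated distribution over object types
truth = vec(0, 3)            # one-hot vector for the true type
error = rmse(estimate, truth)
```

A sharper estimate such as `[0.9, 0.05, 0.05]` yields a smaller `rmse` against the same `truth`, which is the quantity the decision maker tries to drive down over the time window.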

The object classification problem described above
consists of two major parts: integrating the reports from the different sensors and
making sensor deployment decisions based on those reports so that the objective
function given in Equation 1 is satisfied. To address the first part, we have
used distributed information aggregation with a multi-agent prediction market,
while for the latter we have used an expected utility maximizing decision-making
framework. A schematic showing the different components of our system and
their interactions is shown in Figure 1 and explained in the following sections.

3.1 Sensor Agents 

Figure 1: The different components of the prediction market for decision making
and the interactions between them.

As mentioned in Section 1, there is a set of robots in the scenario and each
robot has an on-board sensor for analyzing the objects in the scenario. Different
robots can have different types of sensors and sensors of the same type can have
different degrees of accuracy determined by their cost. Every sensor is associated
with a software agent that runs on-board the robot and performs calculations
related to the data sensed by the robot's sensor. In the rest of the paper, we
use the terms sensor and agent interchangeably. For ease of notation,
we drop the index $l$ corresponding to an object for the rest of this section.
When an object is within the sensing range of a sensor (agent) $a$ at time $t$, the
sensor observes the object's features and its agent receives this observation in the
form of an information signal $\mathbf{g}^{a,t} = \langle g_1, \ldots, g_f \rangle$ that is drawn from the space
of information signals $G \subseteq \Delta(\Theta)$. The conditional probability distribution of
object type $\theta_j$ given an information signal $\mathbf{g} \in G$, $P(\theta_j \mid \mathbf{g}) : G \to [0, 1]$, is
constructed using domain knowledge [3] [10] [11] within a Bayesian network and
is made available to each agent. Agent $a$ then updates its belief distribution
$\mathbf{b}^{a,t}$ using the following equation:

$\mathbf{b}^{a,t} = w_{bel} \cdot P(\Theta \mid \mathbf{g}^{a,t}) + (1 - w_{bel}) \cdot \mathbf{B}^t$, (2)

where $\mathbf{B}^t$ is the belief value vector aggregated from all sensor reports.
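The belief update in Equation 2 is a convex combination of the agent's own posterior and the market's aggregated belief. The following Python sketch illustrates it under stated assumptions: the function name `update_belief` and the example vectors are hypothetical, and the posterior is taken as given rather than computed from a Bayesian network.

```python
def update_belief(posterior, aggregated, w_bel):
    """Blend an agent's posterior P(Theta | g) with the aggregated belief B.

    w_bel is the weight the agent places on its own observation; it must
    lie in [0, 1] so that the result is again a probability distribution.
    """
    if not 0.0 <= w_bel <= 1.0:
        raise ValueError("w_bel must be in [0, 1]")
    if len(posterior) != len(aggregated):
        raise ValueError("vectors must have the same length")
    return [w_bel * p + (1.0 - w_bel) * b for p, b in zip(posterior, aggregated)]


# Hypothetical example with three object types.
posterior = [0.6, 0.3, 0.1]    # assumed output of the agent's Bayesian network
aggregated = [0.4, 0.4, 0.2]   # assumed aggregated belief from the market maker
belief = update_belief(posterior, aggregated, w_bel=0.5)
```

Because both inputs sum to one and the weights sum to one, the updated belief also sums to one, so no renormalization is needed in this sketch.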

Agent Rewards. Agents behave in a self-interested manner to ensure that
they give their 'best' report using their available resources including sensor,
battery power, etc. However, some agents can behave maliciously, either being
planted or compromised to infiltrate the system and subvert the object
classification process, or because they are trying to give an illusion of being efficient
when they do not have sufficient resources to give an accurate report. An agent
$a$ that submits a report at time $t$ uses its belief distribution $\mathbf{b}^{a,t}$ to calculate
the report $\mathbf{r}^{a,t} = \langle r_1^{a,t}, \ldots, r_m^{a,t} \rangle \in \Delta(\Theta)$. An agent has two strategies
for making this report: truthful or malicious. If the agent is truthful, its report
corresponds to its belief, i.e., $\mathbf{r}^{a,t} = \mathbf{b}^{a,t}$. But if it is malicious, it manipulates
its report to reveal an inaccurate belief. Each agent $a$ can update its report $\mathbf{r}^{a,t}$
within the time window $T$ by obtaining new measurements from the object and
using Equation 2 to update its belief. The report from an agent $a$ at time $t$ is
analyzed by a human or agent expert [10] to assign a weight $w^{a,t}$ depending on
the current environment conditions and the accuracy of agent $a$'s sensor type under
those environment conditions (e.g., rainy weather reduces the weight assigned
to the measurement from an IR heat sensor, and soil that is high in metal content
reduces the weight assigned to the measurement from a metal detector).

To motivate an agent to submit reports, an agent $a$ gets an instantaneous
reward, $\rho^{a,t}$, from the market maker for the report $\mathbf{r}^{a,t}$ it submits at time $t$,
corresponding to its instantaneous utility, which is given by the following equation:

$\rho^{a,t} = V(n^{t'=1..t}) - C^a(\mathbf{r}^{a,t})$, (3)

where $V(n^{t'=1..t})$ is the value for making a report, with $n^{t'=1..t}$ being the number
of times the agent $a$ submitted a report up to time $t$, and $C^a(\mathbf{r}^{a,t})$ is the
cost of making report $\mathbf{r}^{a,t}$ for agent $a$ based on the robot's expended time,
battery power, etc. We define the agent's value for each report, $V(n^{t'=1..t})$, as
a constant-valued function up to a certain threshold and a linearly decreasing
function thereafter, to disincentivize agents from making a large number of
reports. Agent $a$'s value function is given by the following equation:

$V(n^{t'=1..t}) = \nu$, if $n^{t'=1..t} \le n^{threshold}$;
$V(n^{t'=1..t}) = \nu \cdot \frac{n^{t'=1..t} - n^{max}}{n^{threshold} - n^{max}}$, otherwise,

where $\nu \in \mathbb{Z}^+$ is a constant value that $a$ gets by submitting reports up to a
threshold, $n^{threshold}$ is the threshold corresponding to the number of reports $a$
can submit before its report's value starts decreasing, and $n^{max}$ is the maximum
number of reports agent $a$ can submit before $V$ becomes negative. Finally, to
determine its strategy while submitting its report, an agent selects the strategy
that maximizes its expected utility obtained from its cumulative reward given
by Equation 3 plus an expected value of its final reward payment if it continues
making similar reports up to the object's time window $T$.
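As a rough illustration of the instantaneous reward, the Python sketch below implements a value function that is constant up to a threshold and then decreases linearly, reaching zero at the maximum report count and becoming negative beyond it. The exact linear form is an assumption inferred from the description above; the default parameters (value 5, threshold 5, maximum 20) follow Table 1, and the function names and cost value are hypothetical.

```python
def report_value(n_reports, v=5.0, n_threshold=5, n_max=20):
    """Value of a report given how many reports the agent has made so far.

    Constant (v) up to n_threshold, then linearly decreasing so that the
    value is zero at n_max and negative afterwards (assumed linear form).
    """
    if n_reports <= n_threshold:
        return v
    return v * (n_max - n_reports) / (n_max - n_threshold)


def instantaneous_reward(n_reports, cost, **params):
    """Instantaneous reward: report value minus the cost of making the report."""
    return report_value(n_reports, **params) - cost


# Hypothetical usage: the agent's 10th report, with an assumed cost of 1.0.
reward = instantaneous_reward(10, cost=1.0)
```

With these defaults the value stays at 5 for the first five reports, drops to 0 at the 20th report, and is negative from the 21st report onward, which discourages an agent from flooding the market with reports.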

3.2 Decision Maker Agent 

The decision maker agent's task is to use the composite belief about an object's
type, $\mathbf{B}^t$, given by the prediction market, and take actions to deploy additional
robots (sensors) based on the value of the objective function given in Equation 1.
Let $AC$ denote a set of possible actions corresponding to deploying a certain
number of robots, and $D = \{d_1, \ldots, d_h\} : d_i \in Ac \subseteq AC$ denote the decision
set of the decision maker. The decision function of the decision maker is given
by $dec : \Delta(\Theta) \to D$. Let $u_j^{dec} \in \mathbb{R}^m$ be the utility that the decision maker
receives by determining an object to be of type $\theta_j$ and let $P(d_i \mid \theta_j)$ be the
probability that the decision maker makes decision $d_i \in D$ given object type $\theta_j$.
$P(d_i \mid \theta_j)$ and $u_j^{dec}$ are constructed using domain knowledge [11] [10]. Given the
aggregated belief distribution $\mathbf{B}^t$ at time $t$, the expected utility to the decision
maker for taking decision $d_i$ at time $t$ is then
$EU^{dec}(d_i, \mathbf{B}^t) = \sum_{j=1}^{m} P(d_i \mid \theta_j) \cdot u_j^{dec} \cdot B_j^t$.
The decision that the decision maker takes at time $t$, also called its
decision rule, is the one that maximizes its expected utility and is given by:
$d^t = \arg\max_{d_i} EU^{dec}(d_i, \mathbf{B}^t)$.
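The decision rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expected utility of a decision is taken as the sum over object types of the decision probability, the type utility, and the aggregated belief component, and the table of probabilities, the utilities, and the function names are all hypothetical.

```python
def expected_utility(p_d_given_theta, utilities, belief):
    """Expected utility of one decision: sum_j P(d | theta_j) * u_j * B_j."""
    return sum(p * u * b for p, u, b in zip(p_d_given_theta, utilities, belief))


def choose_decision(p_table, utilities, belief):
    """Return the index of the decision with the highest expected utility.

    p_table[i][j] is the assumed probability of decision i given type j.
    """
    scores = [expected_utility(row, utilities, belief) for row in p_table]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical example: two candidate decisions, three object types.
p_table = [
    [0.2, 0.5, 0.8],   # decision 0, e.g. deploy no further sensors
    [0.8, 0.5, 0.2],   # decision 1, e.g. deploy one GPR sensor
]
utilities = [10.0, 4.0, 2.0]   # assumed utility of identifying each type
belief = [0.7, 0.2, 0.1]       # assumed aggregated belief B^t
best = choose_decision(p_table, utilities, belief)
```

In this example decision 1 wins because the belief mass sits on the first object type, for which that decision has both a high probability and a high utility.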

3.3 Prediction Market 

A conventional prediction market uses the aggregated beliefs of the market's 
participants or traders about the outcome of a future event, to predict the 
event's outcome. The outcome of an event is represented as a binary variable 
(event happens/does not happen). The traders observe information related to 
the event and report their beliefs, as probabilities about the event's outcome. 
The market maker aggregates the traders' beliefs and uses a scoring rule to 
determine a payment or payoff that will be received by each reporting trader. 
In our multi-agent prediction market, traders correspond to sensor agents, the 
market maker agent automates the calculations on behalf of the conventional 
market maker, and an event in the conventional market corresponds to
identifying the type of a detected object. The time window $T$ over which an object
is sensed is called the duration of the object in the market. This time window
is divided into discrete time steps, $t = 1, 2, \ldots, T$. During each time step, each
sensor agent observing the object submits a report about the object's type to
the market maker agent. The market maker agent performs two functions with
these reports. First, at each time step $t$, it aggregates the agent reports into
an aggregated belief about the object, $\mathbf{B}^t \in \Delta(\Theta)$. Second, it calculates and
distributes payments for the sensor agents: it pays an immediate but nominal
reward to each agent for its report at time step $t$ using Equation 3 and, at
the end of the object's time window $T$, it gives a larger payoff
to each agent that contributed towards classifying the object's type. The
calculations and analysis related to these two functions of the market maker
agent are described in the following sections.

Final Payoff Calculation. The payoff calculation for a sensor agent is
performed by the market maker using a decision scoring rule at the end of the
object's time window. A decision scoring rule [2] is defined as any real-valued
function that takes the agents' reported beliefs, the realized outcome and the
decisions made by the decision maker as input, and produces a payoff for the
agent for its reported beliefs, i.e., $S : \Delta(\Theta) \times \Theta \times D \to \mathbb{R}$. We design a scoring
rule for decision making that is based on how much agent $a$'s final report helped
the decision maker to make the right decisions throughout the duration of the
prediction market and on how close agent $a$'s final report is to the actual object
type. Our proposed scoring rule for decision making, given that the object's true
type is $\theta_j$, is given in Equation 4:

$S(r_j^{a,t}, d^{[1:t]}, \theta_j) = \varpi(d^{[1:t]}, \theta_j) \log(r_j^{a,t})$, (4)

where $r_j^{a,t}$ is the reported belief that agent $a$ submitted at time $t$ for object type
$\theta_j$, $d^{[1:t]}$ is the set consisting of all the decisions that the decision maker took
related to the object up to the current time $t$, $\theta_j$ is the object's true type that was
revealed at the end of the object's time window, $\log(r_j^{a,t})$ measures the goodness
of the report at time $t$ relative to the true object type $\theta_j$, and $\varpi(d^{[1:t]}, \theta_j)$ is the
weight, representing how good all the decisions the decision maker took up to
time $t$ were compared to the true object type $\theta_j$. $\varpi(d^{[1:t]}, \theta_j)$ is determined by
the decision maker and made available to the agents through the market maker.
We assume that $\varpi(d^{[1:t]}, \theta_j) = \sum_{i=1}^{t} P(d_i \mid \theta_j) \cdot u_j^{dec}$, which gives the expected
utility of the decision maker agent for making decision $i$ when the true type of
the object is $\theta_j$.
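A small Python sketch can make the final payoff concrete. It assumes the score is a decision-quality weight multiplied by the logarithm of the probability that the agent reported for the true type, and that the weight is a sum over the decisions taken so far of decision probability times utility. The function names and example numbers are hypothetical.

```python
import math


def decision_weight(p_decisions_given_type, utility_of_type):
    """Weight of the decision history: sum_i P(d_i | theta_j) * u_j."""
    return sum(p * utility_of_type for p in p_decisions_given_type)


def final_score(report_for_true_type, weight):
    """Logarithmic score of the reported probability for the true type."""
    if not 0.0 < report_for_true_type <= 1.0:
        raise ValueError("reported probability must be in (0, 1]")
    return weight * math.log(report_for_true_type)


# Hypothetical example: two decisions were taken, the true type has utility 10.
weight = decision_weight([0.8, 0.9], utility_of_type=10.0)
confident = final_score(0.9, weight)   # agent put 0.9 on the true type
hedged = final_score(0.5, weight)      # agent put 0.5 on the true type
```

The score is never positive: it is zero when the agent put probability one on the true type and grows more negative as the reported probability for the true type shrinks, so `confident` is larger than `hedged`.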

Aggregation. Since a sensor agent gets paid both through its immediate 
rewards for making reports during the object's time window and through the 
scoring rule function for decision making at the end of the object's time window, 
we define the total payment that the agent has received by the end of the object's 
time window as a payment function. 

Definition 1. A function $\Psi(\mathbf{r}^{a,t}, d^{[1:t]}, \theta_j, n^{t'=1..t})$ is called a payment
function if each agent $a$'s total received payment at the end of the object's time
window (when $t = T$) is

$\Psi(\mathbf{r}^{a,t}, d^{[1:t]}, \theta_j, n^{t'=1..t}) = \sum_{k=1}^{t} \rho^{a,k} + S(r_j^{a,t}, d^{[1:t]}, \theta_j)$, (5)

where $\rho^{a,k}$, $S(r_j^{a,t}, d^{[1:t]}, \theta_j)$ and their components are defined as in Equations 3
and 4.

Let $\Psi_{ave}$ denote a weighted average of the payment function in Equation 5
over all the reporting agents, using the report-weights assigned by the expert in
Section 3.1, as given below:

$\Psi_{ave} = \sum_{k=1}^{t} \sum_{a \in A^{k}_{rep}} w^{a,k} \rho^{a,k} + \sum_{a \in A^{t}_{rep}} w^{a,t} S(r_j^{a,t}, d^{[1:t]}, \theta_j)$, (6)

where $A^{t}_{rep}$ is the subset of agents that are able to perceive object features at
time $t$ and $w^{a,k}$ is the weight assigned to agent $a$ at time $k$ by the expert. To
calculate an aggregated belief value in a prediction market, Hanson [5] used
the generalized inverse function of the scoring rule. Likewise, we calculate the
aggregated belief for our market maker agent by taking the generalized inverse
of the average payment function given in Equation 6:

$B_j^t = Agg_{a \in A^{t}_{rep}}(\mathbf{b}^{a,t}) = \exp\left( \frac{\Psi_{ave} - \sum_{k=1}^{t} \sum_{a \in A^{k}_{rep}} w^{a,k} \rho^{a,k}}{\varpi(d^{[1:t]}, \theta_j)} \right)$, (7)

where $B_j^t \in \mathbf{B}^t$ is the $j$-th component of the aggregated belief for object type $\theta_j$.
The aggregated belief vector, $\mathbf{B}^t$, calculated by the market maker agent is sent
to the decision maker agent so that it can calculate its expected utility given in
Section 3.2, and is also sent back to each sensor agent that reported the object's
type up to time step $t$, so that the agent can refine its future reports, if any, using
this aggregate of the reports from other agents.
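To give a feel for this aggregation step, the Python sketch below pools agent reports with a weighted geometric mean. This is an assumption on our part: for a logarithmic payment, taking the inverse of a weighted average of log-reports amounts to exponentiating that average, which is the form used here. The final renormalization step, the function name, and the example data are also assumptions and are not taken from the paper.

```python
import math


def aggregate_beliefs(reports, weights):
    """Weighted geometric mean of agent reports, renormalized to sum to one.

    reports: list of probability vectors with strictly positive entries.
    weights: one non-negative expert weight per report (not all zero).
    """
    total_w = sum(weights)
    if total_w <= 0:
        raise ValueError("weights must have a positive sum")
    m = len(reports[0])
    pooled = []
    for j in range(m):
        # Weighted average of log-probabilities for object type j.
        log_mean = sum(w * math.log(r[j]) for r, w in zip(reports, weights)) / total_w
        pooled.append(math.exp(log_mean))
    norm = sum(pooled)
    return [p / norm for p in pooled]


# Hypothetical example: two agents with equal expert weights.
reports = [[0.8, 0.1, 0.1], [0.4, 0.4, 0.2]]
aggregated = aggregate_beliefs(reports, weights=[1.0, 1.0])
```

One property of this pooling rule is that if every agent submits the same report, the aggregate equals that report regardless of the weights.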
4 Payment Function: Properties and Characteristics

In this section we first show that the payment function is proper, or incentive 
compatible. Then we show that when the market maker uses this payment 
function to reward each agent for its reported beliefs, reporting beliefs truthfully 
is the optimal strategy for each agent. 

We can characterize a proper payment function similar to a proper scoring 
rule. 

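Definition 2. A payment function $\Psi$ is proper, or incentive compatible, if

$\Psi(\mathbf{b}^{a,t}, d^{[1:t]}, \theta_j, n^{t'=1..t}) \ge \Psi(\mathbf{r}^{a,t}, d^{[1:t]}, \theta_j, n^{t'=1..t})$, (9)

$\forall \mathbf{b}^{a,t}, \mathbf{r}^{a,t} \in \Delta(\Theta)$.

$\Psi$ is strictly proper if Equation 9 holds with equality if and only if $\mathbf{b}^{a,t} = \mathbf{r}^{a,t}$.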

Payment functions can be shown to be proper by representing them using
convex functions [2] [4]. To show that our payment function in Equation 5 is
proper, we characterize it in terms of a convex function, as shown below:

Theorem 1. A payment function $\Psi$ is proper for decision making if

$\Psi(\mathbf{r}^{a,t}, d^{[1:t]}, \theta_j, n^{t'=1..t}) = G(\mathbf{r}^{a,t}) - G'(\mathbf{r}^{a,t}) \cdot \mathbf{r}^{a,t} + \frac{G'_{i,j}(\mathbf{r}^{a,t})}{P(d_i \mid \theta_j)}$, (10)

where $G(\mathbf{r}^{a,t})$ is a convex function, $G'(\mathbf{r}^{a,t})$ is a subgradient of $G$ at point
$\mathbf{r}^{a,t}$, and $P(d_i \mid \theta_j) > 0$.

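Proof. Consider a payment function $\Psi$ satisfying Equation 10. We will show
that $\Psi$ must be proper for decision making. We drop the agent and time
superscripts in this proof, and we write $\Psi(\mathbf{r}, d^{[1:t]})$ (or its element $\Psi(r_j, d_i \mid \theta_j)$)
instead of the full $\Psi(\mathbf{r}^{a,t}, d^{[1:t]}, \theta_j, n^{t'=1..t})$.

$EU(\mathbf{b}, \mathbf{b}) = \sum_{i=1}^{h} \sum_{j=1}^{m} P(d_i \mid \theta_j) b_j \Psi(b_j, d_i \mid \theta_j)$

$= \sum_{i=1}^{h} \sum_{j=1}^{m} P(d_i \mid \theta_j) b_j \left( G(\mathbf{b}) - G'(\mathbf{b}) \cdot \mathbf{b} + \frac{G'_{i,j}(\mathbf{b})}{P(d_i \mid \theta_j)} \right)$

$= G(\mathbf{b}) - G'(\mathbf{b}) \cdot \mathbf{b} + \sum_{i=1}^{h} \sum_{j=1}^{m} G'_{i,j}(\mathbf{b}) b_j$

$= G(\mathbf{b}) - G'(\mathbf{b}) \cdot \mathbf{b} + G'(\mathbf{b}) \cdot \mathbf{b} = G(\mathbf{b})$.

Since $G$ is convex and $G'$ is its subgradient, we have

$EU(\mathbf{b}, \mathbf{r}) = \sum_{i=1}^{h} \sum_{j=1}^{m} P(d_i \mid \theta_j) b_j \Psi(r_j, d_i \mid \theta_j)$

$= \sum_{i=1}^{h} \sum_{j=1}^{m} P(d_i \mid \theta_j) b_j \left( G(\mathbf{r}) - G'(\mathbf{r}) \cdot \mathbf{r} + \frac{G'_{i,j}(\mathbf{r})}{P(d_i \mid \theta_j)} \right)$

$= G(\mathbf{r}) + G'(\mathbf{r}) \cdot (\mathbf{b} - \mathbf{r})$

$\le G(\mathbf{b}) = EU(\mathbf{b}, \mathbf{b})$.

Thus, $\Psi$ is a proper payment function for decision making. $\Psi$ is a strictly proper
payment function and the inequality is strict if $G$ is a strictly convex function.

□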

Proposition 1. The payment function given in Equation 5 is proper.

Proof. Let $G(\mathbf{b}) = EU(\mathbf{b}, \mathbf{b})$ and
$G'_{i,j}(\mathbf{b}) = P(d_i \mid \theta_j) \Psi(\mathbf{b}, d^{[1:t]}, \theta_j, n^{t'=1..t})$. Then we can write the payment
function as

$\Psi(\mathbf{b}, d^{[1:t]}, \theta_j, n^{t'=1..t}) = \sum_{i=1}^{h} \sum_{j=1}^{m} P(d_i \mid \theta_j) b_j \Psi(\mathbf{b}, d^{[1:t]}, \theta_j, n^{t'=1..t})$

$- \mathbf{b} \cdot \sum_{i=1}^{h} \sum_{j=1}^{m} P(d_i \mid \theta_j) \Psi(\mathbf{b}, d^{[1:t]}, \theta_j, n^{t'=1..t})$

$+ \Psi(\mathbf{b}, d^{[1:t]}, \theta_j, n^{t'=1..t})$.

We can clearly see that the payment function can be written in the form given
in Equation 10 from Theorem 1. Therefore, the payment function $\Psi$ given in
Equation 5 is a proper payment function. □

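Agent Reporting Strategy. Assume that agent $a$'s report at time $t$ is
its final report; then its utility function can be written as
$u_j^{a,t} = \sum_{k=1}^{t} \rho^{a,k} + S(r_j^{a,t}, d^{[1:t]}, \theta_j)$. Then, agent $a$'s expected utility for object type $\theta_j$ given its
reported belief for object type $\theta_j$, $r_j^{a,t}$, and its true belief about object type $\theta_j$,
$b_j^{a,t}$, at time $t$ is

$EU_j^{a}(r_j^{a,t}, b_j^{a,t}) = \sum_{i=1}^{h} P(d_i \mid \theta_j) b_j^{a,t} u_j^{a,t}$ (11)

$= \sum_{i=1}^{h} P(d_i \mid \theta_j) b_j^{a,t} \left( \sum_{k=1}^{t} \rho^{a,k} + S(r_j^{a,t}, d^{[1:t]}, \theta_j) \right)$,

where $P(d_i \mid \theta_j)$ is the probability that the decision maker takes decision $d_i$ when
the object's type is $\theta_j$.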

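Proposition 2. If agent $a$ is paid according to $\Psi$, then it reports its beliefs
about the object types truthfully.

Proof. Sensor agent $a$ wants to maximize its expected utility function and solves
the following program:

$\arg\max_{\mathbf{r}^{a,t}} \sum_{i=1}^{h} \sum_{j=1}^{m} P(d_i \mid \theta_j) b_j^{a,t} \left[ \sum_{k=1}^{t} \rho^{a,k} + \varpi(d^{[1:t]}, \theta_j) \log(r_j^{a,t}) \right]$

s.t. $\sum_{j=1}^{m} r_j^{a,t} = 1$.

The Lagrangian is

$\mathcal{L}(\mathbf{r}^{a,t}, \lambda) = \sum_{i=1}^{h} \sum_{j=1}^{m} P(d_i \mid \theta_j) b_j^{a,t} \left[ \sum_{k=1}^{t} \rho^{a,k} + \varpi(d^{[1:t]}, \theta_j) \log(r_j^{a,t}) \right] - \lambda \left( \sum_{j=1}^{m} r_j^{a,t} - 1 \right)$.

The first order conditions are

$\frac{\partial \mathcal{L}}{\partial r_j^{a,t}} = \sum_{i=1}^{h} P(d_i \mid \theta_j) b_j^{a,t} \frac{\varpi(d^{[1:t]}, \theta_j)}{r_j^{a,t}} - \lambda = 0$,

$\frac{\partial \mathcal{L}}{\partial \lambda} = 1 - \sum_{j=1}^{m} r_j^{a,t} = 0$.

Substituting $r_j^{a,t}$ into the second equation above, we have

$\sum_{j=1}^{m} \frac{\varpi(d^{[1:t]}, \theta_j) b_j^{a,t} \sum_{i=1}^{h} P(d_i \mid \theta_j)}{\lambda} = 1$,

and therefore $r_j^{a,t} = b_j^{a,t}$.

□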
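The truthfulness result can be checked numerically in a simplified setting. The Python sketch below assumes a single positive weight shared by all object types, so that the expected payment reduces to a weighted expected logarithmic score; this simplification, the function name, and the candidate reports are assumptions made for illustration only.

```python
import math


def expected_log_score(belief, report, weight=1.0):
    """Expected logarithmic score of a report under the agent's true belief."""
    return sum(b * weight * math.log(r) for b, r in zip(belief, report) if b > 0)


# Hypothetical true belief over three object types.
belief = [0.6, 0.3, 0.1]
truthful = expected_log_score(belief, belief)

# A few hypothetical misreports, each a valid probability distribution.
candidates = [
    [0.5, 0.4, 0.1],
    [0.8, 0.1, 0.1],
    [1 / 3, 1 / 3, 1 / 3],
    [0.6, 0.2, 0.2],
]
# Gap between the truthful score and each misreport's score.
gaps = [truthful - expected_log_score(belief, r) for r in candidates]
```

Every entry of `gaps` is positive, since each gap is the weight times the Kullback-Leibler divergence from the belief to the misreport, which is positive whenever the two distributions differ.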



5 Experimental Results 

We have conducted several experiments using our aggregation technique for 
decision-making within a multi-sensor landmine detection scenario described in 
Section 1. Our environment contains different buried objects, some of which
are landmines. The true types of the objects are randomly determined at the 
beginning of the simulation. Due to the scarcity of real data related to landmine 
detection, we have used the domain knowledge that was reported in [j2[TUJ[TT] to 
determine object types, object features, sensor agents' reporting costs, decision 
maker agent's decision set, decision maker agent's utility of determining objects 
of different types, and, to construct the probability distributions for P(0j \g) and 
P(di\0j). We report simulation results for root mean squared error (RMSE) 
defined in Section 3 and also for the number of sensors over time, cost over object
types, and average utility of the sensors over time. 

Name                                                                 Value
-------------------------------------------------------------------  ----------------------------------------------------------------
Object types                                                         mine, metallic object (non-mine), non-metallic object (non-mine)
Features                                                             metallic content, object's area, object's depth, sensor's position
Sensor types                                                         MD, IR, and GPR
Max no. of sensors                                                   10 (5 MD, 3 IR, 2 GPR)
Max no. of decisions                                                 14
$T$ (object identification window)                                   10
$\nu$ (agent's value if $n^{t'=1..t} \le n^{threshold}$)              5
$n^{max}$ (max no. of reports before value is negative)              20
$n^{threshold}$ (no. of reports before agent's value is less than $\nu$)  5

Table 1: Parameters used for our simulation experiments.

Compared Techniques. For comparing the performance of our prediction
market based object classification technique, we have used two other
well-known techniques for information fusion: (a) Dempster-Shafer (D-S) theory for
landmine classification [10], where a two-level approach based on belief
functions is used. At the first level, the detected object is classified according to
its metal content. At the second level the chosen level of metal content is
further analyzed to classify the object as a landmine or a friendly object. The
belief update of the sensors that we used for the D-S method is the same one we
have described in Section 3.1. (b) Distributed Data Fusion (DDF) [9], where
sensor measurements are refined over successive observations using a
temporal, Bayesian inference-based, information filter. To compare DDF with our
prediction market-based technique, we replaced our belief aggregation
mechanism given in Equation 7 with a DDF-based information filter. We compare
our techniques using some standard evaluation metrics from multi-sensor
information fusion [13]: root mean squared error (RMSE) defined as in Section 3,
normed mean squared errors (NMSE) calculated as:

NMSE^-veciO,)) = 10 log w ; " Sfe^^gjl! , and, the in- 

formation gain, also known as Kullback-Leibler divergence and relative entropy, 
calculated as: 



6* was calculated using D-S, DDF, and our prediction market technique (O 4 = 
B*). Since the focus of our work is on the quality of information fusion, we will 
concentrate on describing the results for one object. We assume that there are 
three types of sensors, MD (least operation cost, most noisy), IR (intermediate 
operation cost, moderately noisy), and GPR (expensive operation cost, most ac- 
curate). We also assume that there are a total of 5 MD sensors, 3 IR sensors, and 
2 GPR sensors available to the decision maker for classifying this object. Ini- 
tially, the object is detected using one MD sensor. Once the object is detected, 
the time window in the prediction market for identifying the object's type starts. 
The MD sensor sends its report to the market maker in the prediction market 
and the decision maker makes its first decision based on this one report. We
assume that the decision maker's decision (sent to the robot/sensor scheduling
algorithm in Figure 1) is how many (0-3) and what type (MD, IR, GPR) of sensors
to send to the site of the detected object subsequently. We have considered a
set of 14 out of all the possible decisions under this setting. From [11], we
derive four object features: the metallic content, the area of the object, the
depth of the object, and the position of the sensor. Combinations of the values
of these four features constitute the signal set G, and at each time step a
sensor perceiving the object receives a signal g ∈ G. The value of the signal
also varies based on the robot/sensor's current position relative to the object.
We assume that the identification of an object stops and the object type is
revealed when either B^t_j > 0.95, for any j, or after 10 time steps. The
default values for all domain-related parameters are shown in Table 1. All of
our results were averaged over 10 runs, and the error bars indicate the standard
deviation over the runs.

Object type                     Time step  PM           DDF               D-S
Mine                            1          1(1MD)       1(1MD)            1(1MD)
                                2          3(1MD,1IR)   3(1MD,1GPR)       3(1IR,1GPR)
                                3          4(1GPR)      5(1MD,1IR)        4(1MD)
                                4          5(1MD)       6(1IR)            5(1MD)
                                5          6(1MD)       7(1MD)            6(1IR)
                                6          7(1IR)       8(1MD)            7(1IR)
                                7          -            9(1IR)            8(1MD)
Metallic, or Friendly for D-S   1          1(1MD)       1(1MD)            1(1MD)
                                2          3(1MD,1IR)   4(1MD,1IR,1GPR)   3(1IR,1GPR)
                                3          4(1GPR)      5(1MD)            4(1MD)
                                4          5(1MD)       6(1IR)            5(1IR)
                                5          6(1IR)       7(1MD)            6(1IR)
                                6          7(1MD)       8(1IR)            7(1MD)
                                7          8(1IR)       9(1GPR)           8(1MD)
                                8          -            9(1MD)            8(1MD)
Non-metallic                    1          1(1MD)       1(1MD)            -
                                2          2(1MD)       2(1IR)            -
                                3          3(1IR)       3(1MD)            -
                                4          4(1MD)       4(1GPR)           -
                                5          5(1IR)       5(1MD)            -
                                6          6(1MD)       6(1IR)            -
                                7          -            7(1MD)            -

Table 2: Number and types of sensors deployed over time by a decision maker to
classify different types of objects.
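The stopping rule stated above (stop once some belief B^t_j exceeds 0.95, or after 10 time steps) and the averaging over 10 runs can be sketched as follows. This is a minimal illustration only: `run_until_confident` and `toy_update` are hypothetical names, the uniform prior is an assumption, and the toy update merely stands in for the market's belief aggregation, which is not reproduced here.

```python
import random
import statistics


def run_until_confident(update, n_types=3, threshold=0.95, max_steps=10, seed=0):
    """Run one classification episode; return (steps_used, final_belief)."""
    rng = random.Random(seed)
    belief = [1.0 / n_types] * n_types  # uniform prior (assumption)
    for step in range(1, max_steps + 1):
        belief = update(belief, rng)
        # Stop as soon as any object type exceeds the confidence threshold.
        if max(belief) > threshold:
            return step, belief
    # Otherwise the time window closes after max_steps.
    return max_steps, belief


def toy_update(belief, rng, true_type=0, strength=2.0):
    """Hypothetical stand-in update: noisy evidence favouring the true type."""
    boosted = [
        b * (strength if j == true_type else 1.0) * rng.uniform(0.8, 1.2)
        for j, b in enumerate(belief)
    ]
    total = sum(boosted)
    return [b / total for b in boosted]


# Average the number of time steps over 10 runs, with the standard deviation
# that would be drawn as an error bar.
steps = [run_until_confident(toy_update, seed=s)[0] for s in range(10)]
mean_steps = statistics.mean(steps)
std_steps = statistics.stdev(steps)
```

Keeping the update as a function argument lets the same loop be reused with different aggregation rules, which mirrors how the three compared techniques share one experimental setting.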

For our first group of experiments we analyze the performance of our technique
w.r.t. the variables in our model, such as W_bel and time, and w.r.t. sensor
and object types. We observe that as more information gets sensed for the
object, the RMSE value, shown in Figure 2(a), decreases over time. It takes on
average 6-8 time steps to predict the object type with 95% or greater accuracy,
depending on the object type and the value of W_bel. We also observe that our
model performs best with W_bel = 0.5 (in Equation 2), when the
[Figure 2, panels (a)-(d). X-axes: Number of Time Steps (a), Number of Time
Steps (b), Object Type (c), Number of Time Steps (d).]

Figure 2: RMSE for different values of W_bel (a), average sensors' utilities
for different sensor types (b), cost for different object types (c), RMSE for
sensors' reports averaged over sensor types (d).



agent equally incorporates its private signal and the market's aggregated
belief at each time step into its own belief update. Figure 2(b) shows the
average utility of the agents based on their type. We can see that MD sensors
get more utility because their costs of calculating and submitting reports are
generally lower, whereas GPR sensors get the least utility because they incur
the highest cost. This result is further verified in Figure 2(c), where we can
see the costs based on sensor types and also based on object types. We observe
that detecting a metallic object that is not a mine has the highest cost. We
posit that this is because both MD and IR sensors can detect metallic content
in the object, and the extra cost is due to the time and effort spent
differentiating a metallic object from a mine. Although most mines are metallic
[10, 11], we can see that the costs of detecting a mine and a non-metallic
object are similar because we require a prediction accuracy of at least 95%.
Due to the sensitive nature of the landmine detection problem, it is important
to ensure that even a non-metallic object is not a mine, even if we incur
higher costs. However, despite the MD sensors' high utility (Figure 2(b)) and
low cost (Figure 2(c)), their error in classifying the object type is the
largest, as can be seen from Figure 2(d).
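The role of W_bel = 0.5 described above, where an agent gives equal weight to its private signal and to the market's aggregated belief, can be illustrated with a simple convex combination. The linear mixing form, the function name `mix_beliefs`, and the choice of which term W_bel multiplies are assumptions made for illustration; the update equation itself is not reproduced in this section.

```python
def mix_beliefs(private_belief, market_belief, w_bel=0.5):
    """Blend an agent's private belief with the market's aggregated belief.

    Assumed form: w_bel * private + (1 - w_bel) * market, renormalised.
    Both inputs are probability vectors over the same object types.
    """
    if not 0.0 <= w_bel <= 1.0:
        raise ValueError("w_bel must lie in [0, 1]")
    if len(private_belief) != len(market_belief):
        raise ValueError("belief vectors must have the same length")
    mixed = [
        w_bel * p + (1.0 - w_bel) * m
        for p, m in zip(private_belief, market_belief)
    ]
    total = sum(mixed)
    # Renormalise to guard against rounding drift.
    return [x / total for x in mixed]


# With w_bel = 0.5 the result is the midpoint of the two beliefs.
updated = mix_beliefs([0.7, 0.2, 0.1], [0.5, 0.3, 0.2], w_bel=0.5)
```

Under this assumed form, W_bel = 1 would make the agent ignore the market entirely and W_bel = 0 would make it ignore its own signal, so 0.5 is the balanced case.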

In Table 2 we show how the decision maker's decisions using our prediction
market technique result in the deployment of different numbers and types of
sensors over the time window of the object. We report the results for the
belief update weight W_bel = 0.5 (used in Equation 5) while using our
prediction market model, as well as using D-S and DDF. We see that non-metallic
object classification requires fewer sensors, as both MD and IR sensors can
distinguish between metallic and non-metallic objects, and so deploying just
these two types of sensors can help to infer that the object is not a mine. In
contrast, metallic objects require more time to be classified as not being a
mine because more object features need to be observed using all three sensor
types. We also observe that on average our aggregation technique using the
prediction market deploys a total of 6-8 sensors and detects the object type
with at least 95% accuracy in 6-7 time steps, while the next best technique,
DDF, deploys a total of 7-9 sensors and detects the object type with at least
95% accuracy in 7-8 time steps.

Our results shown in Figure 3(a) illustrate that the RMSE using our PM-based
technique is below the RMSEs using D-S and DDF by an average of 8%
[Figure 3, panels (a)-(c). X-axis of each panel: Number of Time Steps.]

Figure 3: Comparison of our Prediction Market-based information aggregation
with Dempster-Shafer and Distributed Data Fusion techniques using different
metrics: RMSE (a), NMSE (b), information gain (c).



and 5% respectively. Figure 3(b) shows that the NMSE values using our PM-based
technique are 18% and 23% lower on average than those of the D-S and DDF
techniques respectively. Finally, in Figure 3(c) we observe that the
information gain for our PM-based technique is 12% and 17% higher than for the
D-S and DDF methods respectively.
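To make the comparison metrics concrete, the sketch below computes RMSE and Kullback-Leibler divergence for a belief vector over object types, together with the relative difference used to express one technique's error as a percentage below another's. The function names, the one-hot encoding of the true object type, and the argument order of the divergence are assumptions for illustration, and the numbers are made up rather than taken from the experiments.

```python
import math


def rmse(belief, truth):
    """Root mean squared error between a belief vector and the true outcome."""
    return math.sqrt(
        sum((b - t) ** 2 for b, t in zip(belief, truth)) / len(truth)
    )


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q), in nats; terms with p_i = 0 are skipped."""
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0
    )


def relative_reduction(ours, baseline):
    """Percentage by which `ours` lies below `baseline`."""
    return 100.0 * (baseline - ours) / baseline


# Illustrative values only: the true type is the first of three object types.
truth = [1.0, 0.0, 0.0]
belief = [0.9, 0.05, 0.05]
error = rmse(belief, truth)
divergence = kl_divergence(truth, belief)
```

A statement such as "RMSE is below the baseline by 8%" then corresponds to `relative_reduction(ours, baseline)` being approximately 8.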

6 Conclusions 

In this paper, we have described a sensor information aggregation technique
for object classification with a multi-agent prediction market and developed a
payment function used by the market maker to incentivize truthful revelation
by each agent. Currently, the rewards given by the market maker agent to the
sensor agents are additional side payments incurred by the decision maker. In
the future we plan to investigate a payment function that can achieve budget
balance. We are also interested in integrating our decision-making problem with
the problem of scheduling robots (sensors) and in incorporating the costs to
the overall system into the decision-making costs. Another direction we plan to
investigate is the problem of minimizing the time to detect an object in
addition to maximizing the accuracy of detection. Lastly, we plan to
incorporate our aggregation technique into experiments with real robots.